Woylier: Alternative tour frame interpolation method

The woylier package implements an alternative method for interpolating the path between tour frames using Givens rotations.

Zoljargal Batsaikhan https://www.britannica.com/animal/quokka (Monash University) , Dianne Cook https://www.britannica.com/animal/bilby (Monash University) , Ursula Laa https://www.britannica.com/animal/bilby (BOKU University)
2022-10-06

Introduction

Projection is a tool for visualizing high-dimensional data in lower dimensions (Buja et al. 2005). When projecting high-dimensional data onto lower dimensions, one may care about the orientation of the projection. In such cases the projection needs to be onto a frame rather than a plane, because orientation does not matter for a plane.

We aim to provide an alternative interpolation method that is compatible with the current geodesic_path() function of the tourr package. We then apply this interpolation method to projection pursuit with the splines index to search for nonlinear associations between variables in a financial data set. Finally, we provide some examples of using the package.

Background

Computational Methods for High-Dimensional Rotations in Data Visualization

Visualization of data with more than three dimensions is based on rotations of a lower-dimensional projection in high-dimensional space. An animation of these projections is a one-parameter (time) family of pictures.

This paper explains algorithms for dynamic projections such as grand tours, guided tours, and manual tours.

While we can imagine the rotation of a 3-D object, generalizing rotation to more than three dimensions is quite complex. The notion of a grand tour was introduced by Asimov (1985). A grand tour shows 2-D projections of a higher-dimensional space with no user control. It is a space-filling curve in the manifold of low-dimensional projections of the high-dimensional data space. The authors of this paper further explore the interactivity of tours, which resulted in “guided tours” and “manual tours”.

The topic of this paper is the construction of paths of projections. Interpolating paths of projections can be compared to connecting line segments that interpolate points in Euclidean space. Interpolation acts as a bridge between continuous animation and a discrete choice of sequences of projections. A sequence of projections can be constructed in various ways depending on the user's purpose. If the user wants to look at the data from all sides, a random sequence of projections can be used, as implemented in grand tours. Alternatively, the sequence of projections can be pre-computed, data-driven, or manually controlled.

Projection pursuit is a technique for finding the data projections that are most structured according to a criterion of interest, such as clustering or spread.

The authors of the paper prefer the interpolation method over the original “torus method” used in Asimov (1985) for several reasons. One reason is that projection paths based on the torus method can be non-uniformly distributed, while those from the interpolation method are uniformly distributed by construction. Another pitfall of the torus method is that it causes discontinuity when the user needs to change the set of variables being viewed.

Planar rotation

A rotation matrix is a transformation matrix used to perform a rotation in Euclidean space. The matrix that rotates the 2-D xy plane by an angle \(\theta\) is:

\[ \begin{bmatrix}\cos \theta &-\sin \theta \\\sin \theta &\cos \theta \end{bmatrix} \]

If the rotation is in the plane of variables i and j, it is called a Givens rotation.
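A minimal base R sketch of such a rotation may make this concrete. The helper name `givens_rotation` and its arguments are assumptions made for illustration; they are not functions from woylier or tourr. For p = 2, i = 1, j = 2 the result reduces to the planar rotation matrix shown above.

```r
# Hypothetical helper (not part of woylier or tourr): the p x p rotation
# by angle theta acting in the (i, j) coordinate plane.
givens_rotation <- function(p, i, j, theta) {
  stopifnot(i != j, i >= 1, j >= 1, i <= p, j <= p)
  G <- diag(p)                 # start from the identity matrix
  G[i, i] <- cos(theta)
  G[j, j] <- cos(theta)
  G[i, j] <- -sin(theta)
  G[j, i] <- sin(theta)
  G
}

G <- givens_rotation(p = 4, i = 1, j = 3, theta = pi / 6)
u <- c(1, 2, 3, 4)
v <- drop(G %*% u)

# Coordinates 2 and 4 are untouched, and the length of u is preserved.
all.equal(v[c(2, 4)], u[c(2, 4)])
all.equal(sum(v^2), sum(u^2))

# G is orthogonal: t(G) %*% G is the identity.
all.equal(crossprod(G), diag(4))
```

Building the full p x p matrix is wasteful for large p, since only two rows change, but it keeps the correspondence with the matrix notation easy to see.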

The interpolation methods in this project are based on the composition of a number of Givens rotations that map the starting frame onto the target frame.

\[ W_z = R_m(\tau_m) ... R_2(\tau_2)R_1(\tau_1)W_a\]

Interpolating paths of frames

Frame interpolation is necessary when the orientation of the projection matters. Several methods are discussed in the paper, including decomposition of orthogonal matrices, Givens decomposition, and Householder decomposition. The one of interest to us is the Givens path.

The use of Givens rotations comes from the fact that, for any vector u, one can zero out the i-th coordinate with a Givens rotation in the (i, j)-plane for any j \(\neq\) i. This rotation affects only coordinates i and j and leaves all coordinates k \(\neq\) i, j unchanged.
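The zeroing step can be illustrated with a short base R sketch. The helper names `givens_rotation` and `zeroing_angle` are hypothetical and chosen for this example only; the angle formula assumes the sign convention of the planar rotation matrix given earlier.

```r
# Hypothetical helper: p x p rotation by theta in the (i, j) coordinate plane.
givens_rotation <- function(p, i, j, theta) {
  G <- diag(p)
  G[i, i] <- cos(theta); G[j, j] <- cos(theta)
  G[i, j] <- -sin(theta); G[j, i] <- sin(theta)
  G
}

# Angle that zeroes coordinate i of u when rotating in the (i, j) plane.
# With the convention above, the new u_i is cos(theta) u_i - sin(theta) u_j,
# which vanishes when tan(theta) = u_i / u_j.
zeroing_angle <- function(u, i, j) atan2(u[i], u[j])

u <- c(3, 1, 4, 2)
i <- 1; j <- 3
theta <- zeroing_angle(u, i, j)
v <- drop(givens_rotation(length(u), i, j, theta) %*% u)

v[i]         # numerically zero
v[j]         # sqrt(3^2 + 4^2): the length of (u_i, u_j) moves into coordinate j
v[c(2, 4)]   # coordinates outside the (i, j) plane are unchanged
```

Using `atan2()` rather than `atan(u[i] / u[j])` avoids a division by zero when coordinate j is already zero.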

Sequences of Givens rotations can map any orthonormal d-frame F in p-space to the standard d-frame \(E_d=((1, 0, 0, ...)^T, (0, 1, 0, ...)^T, ...)\).
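As a hedged illustration of this statement, the base R sketch below reduces a random orthonormal frame to \(E_d\) one column at a time, zeroing the entries below each diagonal position. The function `reduce_to_standard`, the ordering of the rotations, and the assumption d < p are choices made for this example and are not taken from the woylier source.

```r
# Hypothetical helper: p x p rotation by theta in the (i, j) coordinate plane.
givens_rotation <- function(p, i, j, theta) {
  G <- diag(p)
  G[i, i] <- cos(theta); G[j, j] <- cos(theta)
  G[i, j] <- -sin(theta); G[j, i] <- sin(theta)
  G
}

# Reduce an orthonormal p x d frame (d < p assumed) to the standard frame E_d,
# recording the plane and angle of every rotation applied.
reduce_to_standard <- function(Fr) {
  p <- nrow(Fr); d <- ncol(Fr)
  stopifnot(d < p)
  steps <- list()
  for (k in seq_len(d)) {
    for (i in seq(p, k + 1)) {
      # Zero entry (i, k) by rotating it into the diagonal entry (k, k).
      theta <- atan2(Fr[i, k], Fr[k, k])
      Fr <- givens_rotation(p, i, k, theta) %*% Fr
      steps[[length(steps) + 1]] <- c(i = i, j = k, theta = theta)
    }
  }
  list(frame = Fr, steps = do.call(rbind, steps))
}

set.seed(1)
# A random orthonormal 2-frame in 5-space, obtained from a QR decomposition.
Fr <- qr.Q(qr(matrix(rnorm(5 * 2), nrow = 5)))
res <- reduce_to_standard(Fr)

E_d <- diag(5)[, 1:2]
all.equal(res$frame, E_d)   # the rotated frame matches E_d up to rounding
nrow(res$steps)             # rotations used: (5 - 1) + (5 - 2)
```

Columns that have already been reduced are not disturbed by later rotations, because those rotations only mix rows in which the earlier columns are zero.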

The path construction algorithm works as follows:

  1. Construct the preprojection basis \(B\) by orthonormalizing \(F_z\) with respect to \(F_a\) using Gram-Schmidt:

\[B = (F_a, F_{\star})\]

  2. Get the preprojected frames \(W_a = B^TF_a = E_d\) and \(W_z = B^TF_z\).
  3. Then construct a sequence of Givens rotations that maps \(W_z\) to \(W_a\):

\[ W_a = R_m(\tau_m) ... R_2(\tau_2)R_1(\tau_1)W_z\]

The inverse mapping is obtained by reversing the sequence of rotations with the negative of the angles:

\[R(\tau) = R_1(-\tau_1) ... R_m(-\tau_m), \ W_z = R(\tau)W_a\]

Limitations

Buja et al. (2005) discuss when the orientation of a projection matters. If rendering on a frame and on a rotated version of that frame yields the same visual scene, the orientation does not matter.

When d=1, there is only a one-dimensional projection, visualized horizontally or vertically. If the projection were interpretable, the left-to-right and right-to-left projections would be different. In our case, however, orientation is irrelevant for d=1 because the projection is just a linear combination of variables.

When d=2, we usually plot a Cartesian scatterplot. If we consider reflected or rotated scatterplots, typical structures such as clusters, lines, curves, and outliers remain recognizable in the reflected or rotated view. Therefore, orientation does not matter.

Introduction

Interactive data graphics provide plots that allow users to interact with them. One of the most basic types of interaction is through tooltips, where users are given additional information about elements in the plot by moving the cursor over the plot.

This paper will first review some R packages on interactive graphics and their tooltip implementations. A new package, ToOoOlTiPs, that provides customized tooltips for plots is then introduced. Some example plots will be given to showcase how these tooltips help users to better read the graphics.

Background

Some packages on interactive graphics include plotly (Sievert 2020), which interfaces with JavaScript for web-based interactive graphics, and crosstalk (Cheng and Sievert 2021), which specializes in cross-linking elements across individual graphics. The recent R Journal paper tsibbletalk (Wang and Cook 2021) provides a good example of including interactive graphics in an article for the journal. It has both a set of linked plots and an animated gif example, illustrating linking between time series plots and feature summaries.

Customizing tooltip design with ToOoOlTiPs

ToOoOlTiPs is a package for customizing tooltips in interactive graphics. It features these possibilities.

A gallery of tooltips examples

The palmerpenguins data (Horst et al. 2020) features three penguin species, shown in a lovely illustration by Allison Horst in Figure 1.

A picture of three different penguins with their species: Chinstrap, Gentoo, and Adelie.

Figure 1: Artwork by @allison_horst

Table 1 prints the first few rows of the penguins data:

Table 1: A basic table

species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
Adelie   Torgersen  39.1            18.7           181                3750         male    2007
Adelie   Torgersen  39.5            17.4           186                3800         female  2007
Adelie   Torgersen  40.3            18.0           195                3250         female  2007
Adelie   Torgersen  NA              NA             NA                 NA           NA      2007
Adelie   Torgersen  36.7            19.3           193                3450         female  2007
Adelie   Torgersen  39.3            20.6           190                3650         male    2007

Figure 2 shows an interactive plot of the penguins data, made using the plotly package.

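# Load the packages used below; palmerpenguins supplies the `penguins` data.
library(ggplot2)
library(plotly)
library(palmerpenguins)
p <- penguins %>%
  ggplot(aes(x = bill_depth_mm, y = bill_length_mm,
             color = species)) +
  geom_point()
ggplotly(p)  # convert the static ggplot into an interactive plotly widget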

Figure 2: A basic interactive plot made with the plotly package on the Palmer penguins data. Three species of penguins are plotted with bill depth on the x-axis and bill length on the y-axis. When hovering over a point, a tooltip shows the exact values of the bill depth and length for that point, along with the species name.

Summary

We have displayed various tooltips that are available in the package ToOoOlTiPs.

CRAN packages used

ToOoOlTiPs, plotly, crosstalk, tsibbletalk, palmerpenguins, ggplot2

CRAN Task Views implied by cited packages

Spatial, TeachingStatistics, TimeSeries, WebTechnologies

References

A. Buja, D. Cook, D. Asimov and C. Hurley. Computational methods for high-dimensional rotations in data visualization. Handbook of Statistics, 391–413, 2005. DOI 10.1016/s0169-7161(04)24014-7.
J. Cheng and C. Sievert. crosstalk: Inter-widget interactivity for HTML widgets. 2021. URL https://CRAN.R-project.org/package=crosstalk. R package version 1.1.1.
A. M. Horst, A. P. Hill and K. B. Gorman. palmerpenguins: Palmer Archipelago (Antarctica) penguin data. 2020. URL https://allisonhorst.github.io/palmerpenguins/. R package version 0.1.0.
C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman; Hall/CRC, 2020. URL https://plotly-r.com.
E. Wang and D. Cook. Conversations in time: Interactive visualisation to explore structured temporal data. The R Journal, 2021. URL https://journal.r-project.org/archive/2021/RJ-2021-050/index.html.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Batsaikhan, et al., "Woylier: Alternative tour frame interpolation method", The R Journal, 2022

BibTeX citation

@article{woylier_article,
  author = {Batsaikhan, Zoljargal and Cook, Dianne and Laa, Ursula},
  title = {Woylier: Alternative tour frame interpolation method},
  journal = {The R Journal},
  year = {2022},
  note = {https://doi.org/10.32614/woylier_article},
  doi = {10.32614/woylier_article},
  issn = {2073-4859},
  pages = {1}
}